ZINC FINGER PROTEIN DERIVATIVES AND METHODS THEREFOR
BACKGROUND OF THE INVENTION 

1. Field of the Invention

This invention relates generally to the field of regulation of gene expression and
specifically to methods of modulating gene expression by utilizing polypeptides derived
from zinc finger-nucleotide binding proteins.

2. Description of Related Art 

Transcriptional regulation is primarily achieved by the sequence-specific binding of
proteins to DNA and RNA. Of the known protein motifs involved in the sequence-specific
recognition of DNA, the zinc finger protein is unique in its modular nature. To
date, zinc finger proteins have been identified which contain between 2 and 37 modules.
More than two hundred proteins, many of them transcription factors, have been shown
to possess zinc finger domains. Zinc fingers connect transcription factors to their target
genes mainly by binding to specific sequences of DNA base pairs - the "rungs" in the
DNA "ladder".

Zinc finger modules are approximately 30 amino acid-long motifs found in a wide
variety of transcription regulatory proteins in eukaryotic organisms. As the name
implies, this nucleic acid binding protein domain is folded around a zinc ion. The zinc
finger domain was first recognized in the transcription factor TFIIIA from Xenopus
oocytes (Miller, et al., EMBO J., 4:1609-1614, 1985; Brown, et al., FEBS Lett., 186:271-274,
1985). This protein consists of nine imperfect repeats of a consensus sequence:

(Tyr, Phe)-X-Cys-X₂₋₄-Cys-X₃-Phe-X₅-Leu-X₂-His-X₃₋₅-His-X₂₋₆ (SEQ ID NO: 1)
where X is any amino acid. 
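
As a rough illustration of how a consensus of this kind can be used to scan a protein sequence, the sketch below writes it as a regular expression over one-letter amino acid codes. The spacer lengths (2-4, 3, 5, 2 and 3-5 residues) are assumptions of this sketch, the trailing spacer after the second histidine is omitted, and the function name and the toy peptide are hypothetical. Because natural fingers are imperfect repeats, a strict pattern like this would be expected to miss some real fingers.

```python
import re

# Assumed one-letter rendering of the consensus:
# (Tyr or Phe)-X-Cys-X(2-4)-Cys-X(3)-Phe-X(5)-Leu-X(2)-His-X(3-5)-His
# The spacer lengths are assumptions made for this sketch.
ZF_CONSENSUS = re.compile(
    r"[YF][A-Z]C[A-Z]{2,4}C[A-Z]{3}F[A-Z]{5}L[A-Z]{2}H[A-Z]{3,5}H"
)


def find_zinc_fingers(protein):
    """Return (start, matched_text) for each non-overlapping consensus match."""
    return [(m.start(), m.group()) for m in ZF_CONSENSUS.finditer(protein.upper())]


# Toy peptide invented for illustration; it is not a real protein sequence.
TOY = "MAYKCPECGKSFSRSDALTRHQRTHTGEKP"
print(find_zinc_fingers(TOY))
```

Replacing either cysteine in the toy peptide with another residue removes the match, which mirrors the role of the conserved zinc ligands in defining the motif.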
Like TFIIIA, most zinc finger proteins have conserved cysteine and histidine residues
that tetrahedrally coordinate the single zinc atom in each finger domain. The structures
of individual zinc finger peptides of this type (containing two cysteines and two
histidines), such as those found in the yeast protein ADR1, the human male-associated
protein ZFY, the HIV enhancer protein and the Xenopus protein Xfin, have been solved
by high resolution NMR methods (Kochoyan, et al., Biochemistry, 30:3371-3386, 1991;
Omichinski, et al., Biochemistry, 29:9324-9334, 1990; Lee, et al., Science, 245:635-637,
1989) and detailed models for the interaction of zinc fingers and DNA have been
proposed (Berg, 1988; Berg, 1990; Churchill, et al., 1990). Moreover, the structure of
a three finger polypeptide-DNA complex derived from the mouse immediate early
protein zif268 (also known as Krox-24) has been solved by x-ray crystallography
(Pavletich and Pabo, Science, 252:809-817, 1991). Each finger contains an antiparallel
β-turn, a finger tip region and a short amphipathic α-helix which, in the case of zif268
zinc fingers, binds in the major groove of DNA. In addition, the conserved hydrophobic
amino acids and zinc coordination by the cysteine and histidine residues stabilize the
structure of the individual finger domain.

While the prototype zinc finger protein TFIIIA contains an array of nine zinc fingers
which binds a 43 bp sequence within the 5S RNA genes, regulatory proteins of the zif268
class (Krox-20, Sp1, for example) contain only three zinc fingers within a much larger
polypeptide. The three zinc fingers of zif268 each recognize a 3 bp subsite within a 9 bp
recognition sequence. Most of the DNA contacts made by zif268 are with phosphates
and with guanine residues on one DNA strand in the major groove of the DNA helix. In
contrast, the mechanism of TFIIIA binding to DNA is more complex. The amino-terminal
3 zinc fingers recognize a 13 bp sequence and bind in the major groove. Similar
to zif268, these fingers also make guanine contacts primarily on one strand of the DNA.
Unlike the zif268 class of proteins, zinc fingers 4 and 6 of TFIIIA each bind either in or
across the minor groove, bringing fingers 5 and 7 through 9 back into contact with the
major groove (Clemens, et al., Proc. Natl. Acad. Sci. USA, 89:10822-10826, 1992).
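
The modular picture of a three finger protein reading a 9 bp site as three consecutive 3 bp subsites can be sketched in a few lines. The helper name and the example site are assumptions made for illustration, and the sketch says nothing about which finger contacts which subsite or in what orientation.

```python
def split_subsites(site, subsite_len=3):
    """Split a recognition sequence into consecutive fixed-length subsites (5' to 3')."""
    site = site.upper()
    if set(site) - set("ACGT"):
        raise ValueError("site must contain only A, C, G, T")
    if len(site) % subsite_len != 0:
        raise ValueError("site length must be a multiple of subsite_len")
    return [site[i:i + subsite_len] for i in range(0, len(site), subsite_len)]


# Illustrative 9 bp site; the sequence is an assumption of this sketch.
EXAMPLE_SITE = "GCGTGGGCG"
print(split_subsites(EXAMPLE_SITE))  # ['GCG', 'TGG', 'GCG']
```

Under this simple model, a protein with n fingers of this class would address a site of 3n bp. The more complex binding described for TFIIIA, with some fingers crossing the minor groove, does not fit such a uniform split.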
The crystal structure of zif268 indicates that specific histidine (non-zinc coordinating
His residues) and arginine residues on the surface of the α-helix participate in DNA
recognition. Specifically, the charged amino acids immediately preceding the α-helix
and at helix positions 2, 3, and 6 (immediately preceding the conserved histidine)
participate in hydrogen bonding to DNA guanines. Similar to finger 2 of the regulatory
protein Krox-20 and fingers 1 and 3 of Sp1, finger 2 of TFIIIA contains histidine and
arginine residues at these DNA contact positions; further, each of these zinc fingers
minimally recognizes the sequence GGG. Finger swap experiments between
transcription factor Sp1 and Krox-20 have confirmed the 3-bp zinc finger recognition
code for this class of finger proteins (Nardelli, et al., Nature, 349:175-178, 1989).
Mutagenesis experiments have also shown the importance of these amino acids in
specifying DNA recognition. It would be desirable to ascertain a simple code which
specifies zinc finger-nucleotide recognition. If such a code could be deciphered, then
zinc finger polypeptides might be designed to bind any chosen DNA sequence. The
complex of such a polypeptide and its recognition sequence might be utilized to modulate
(up or down) the transcriptional activity of the gene containing this sequence.

Zinc finger proteins have also been reported which bind to RNA. Clemens, et al.
(Science, 260:530, 1993) found that fingers 4 to 7 of TFIIIA contribute 95% of the free
energy of TFIIIA binding to 5S rRNA, whereas fingers 1 to 3 make a similar contribution
in binding the promoter of the 5S gene. Comparison of the two known 5S RNA binding
proteins, TFIIIA and p43, reveals few homologies other than the consensus zinc ligands
(C and H), hydrophobic amino acids and a threonine-tryptophan-threonine triplet motif
in finger 6.

In order to redesign zinc fingers, new selective strategies must be developed and
additional information on the structural basis of sequence-specific nucleotide recognition
is required. Current protein engineering efforts utilize design strategies based on 
sequence and/or structural analogy. While such a strategy may be sufficient for the 
transfer of motifs, it limits the ability to produce novel nucleotide binding motifs not 



